Sensitivity Analysis in Markov Decision Processes with Uncertain Reward Parameters
Authors
Abstract
Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be performed directly for a Markov decision process with uncertain reward parameters using the Bellman equations. In particular, we consider problems involving (i) a single stationary parameter, (ii) multiple stationary parameters, and (iii) multiple nonstationary parameters. We illustrate the applicability of this work through a capacitated stochastic lot-sizing problem.
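The idea of performing sensitivity analysis directly through the Bellman equations can be sketched numerically: solve a small MDP by value iteration, then estimate how the optimal value changes as a reward parameter is perturbed. The MDP below (two states, two actions, parameter `theta` scaling the rewards of one action) is a hypothetical illustration, not an example from the paper.

```python
import numpy as np

def value_iteration(R, P, gamma=0.9, tol=1e-10):
    """Solve the Bellman optimality equations
    V(s) = max_a [ R[s,a] + gamma * sum_t P[a,s,t] V(t) ]."""
    n_states, _ = R.shape
    V = np.zeros(n_states)
    while True:
        # Q[s,a] = R[s,a] + gamma * sum_t P[a,s,t] * V[t]
        Q = R + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# Hypothetical transition probabilities: P[a, s, t] = Pr(t | s, a).
P = np.array([
    [[0.8, 0.2], [0.3, 0.7]],   # transitions under action 0
    [[0.5, 0.5], [0.1, 0.9]],   # transitions under action 1
])

def optimal_value(theta):
    # Uncertain reward parameter theta scales the rewards of action 1.
    R = np.array([[1.0, theta], [0.0, 2.0 * theta]])
    return value_iteration(R, P)

# Finite-difference estimate of dV*(s)/d(theta) around theta = 1.
eps = 1e-6
V_plus, _ = optimal_value(1.0 + eps)
V_minus, _ = optimal_value(1.0 - eps)
sensitivity = (V_plus - V_minus) / (2 * eps)
```

Since increasing `theta` only raises rewards, the optimal value is nondecreasing in `theta`, so each entry of `sensitivity` is nonnegative; a manager could use such estimates to judge which reward parameters most affect the optimal solution.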
Related Papers
The Inventory System Management under Uncertain Conditions and Time Value of Money
This study develops an inventory model to determine the ordering policy for deteriorating items with shortages under Markovian inflationary conditions. A Markov process is one whose future behavior cannot be accurately predicted from its past behavior (only from its current state) and which involves random chance or probability. Behavior of business or economy, flow of traffic, p...
Full Text
Multi-Criteria Approaches to Markov Decision Processes with Uncertain Transition Parameters
Markov decision processes (MDPs) are a well-established model for planning under uncertainty. In most situations the MDP parameters are estimated from real observations, so their values are not known precisely. Different types of MDPs with uncertain, imprecise, or bounded transition rates or probabilities and rewards exist in the literature. Commonly the resulting processes are optimized wi...
Full Text
Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters
Markov decision processes (MDPs) are a popular model for performance analysis and optimization of stochastic systems. The parameters describing the stochastic behavior of an MDP are estimated from empirical observations of a system, so their values are not known precisely. Different types of MDPs with uncertain, imprecise, or bounded transition rates or probabilities and rewards exist in the literature. Commonly...
Full Text
Active Learning in Partially Observable Markov Decision Processes
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. We propose two formulations of the problem. The first formulation relies on a model of the uncertainty that is incorporated directly into the POMDP planning problem. This has some interesting theoretical properties, but is impr...
Full Text